Publishing Skewed Sensitive Microdata
Abstract
Highly skewed microdata contain some sensitive attribute values that occur far more frequently than others. Such data violate the “eligibility condition” assumed by existing works for limiting the probability of linking an individual to a specific sensitive attribute value. Specifically, if the frequency of some sensitive attribute value is too high, publishing the sensitive attribute alone would lead to linking attacks. In many practical scenarios, however, this eligibility condition is violated. In this paper, we consider how to publish microdata in this case. A natural solution is “minimally” suppressing “dominating” records to restore the eligibility condition. We show that the minimality of suppression may itself lead to linking attacks. To limit the inference probability, we propose a randomized suppression solution. We show that, for a given privacy requirement, this approach has the least expected suppression in a large family of randomized solutions. Experiments show that this solution approaches the lower bound on the suppression required for this problem.
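To make the eligibility condition concrete, the following is a minimal sketch under stated assumptions, not the paper's algorithm. It assumes the condition takes the form commonly used in the l-diversity literature: no sensitive value may occur in more than a 1/ell fraction of the records. The function names, the parameter `ell`, and the sample values are hypothetical. The greedy routine illustrates deterministic suppression of dominating records, the kind of approach the abstract warns can itself leak information; the randomized solution proposed in the paper is not reproduced here.

```python
from collections import Counter


def is_eligible(sensitive_values, ell):
    """Assumed eligibility check: every value occurs in at most n/ell records."""
    n = len(sensitive_values)
    counts = Counter(sensitive_values)
    return all(count * ell <= n for count in counts.values())


def greedy_suppress(sensitive_values, ell):
    """Suppress one record of the most frequent value at a time until the
    assumed condition holds. Returns {value: number of suppressed records}.
    Illustrative only; this is a deterministic baseline, not the paper's method."""
    counts = Counter(sensitive_values)
    suppressed = Counter()
    n = sum(counts.values())
    while n > 0 and max(counts.values()) * ell > n:
        value, _ = counts.most_common(1)[0]
        counts[value] -= 1
        suppressed[value] += 1
        n -= 1
    return dict(suppressed)


# Hypothetical skewed data: one value dominates the sensitive attribute.
values = ["flu"] * 8 + ["hiv"] + ["cold"]
print(is_eligible(values, 2))
print(greedy_suppress(values, 2))
```

Because the suppression count here is a deterministic function of the data, an adversary who knows the rule can reason backward from the published table to the original frequencies, which is the motivation the abstract gives for randomizing the suppression.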
Similar resources
Privacy beyond Single Sensitive Attribute
Publishing individual-specific microdata has serious privacy implications. The k-anonymity model has been proposed to prevent identity disclosure from microdata, and the work on l-diversity and t-closeness attempts to address attribute disclosure. However, most current work only deals with publishing microdata with a single sensitive attribute (SA), whereas real life scenarios often involve microd...
Efficient Techniques for Preserving Microdata Using Slicing
Privacy-preserving publishing is a family of techniques for applying privacy to vast amounts of collected data. One recent and prevalent problem lies in the field of data publication. The data often contain personally identifiable information, so releasing such data raises privacy problems. Several anonymization techniques such as generalization and bucketization have been designed for priva...
SLOMS: A Privacy Preserving Data Publishing Method for Multiple Sensitive Attributes Microdata
Multi-dimension bucketization is a typical method to anonymize multiple sensitive attributes. However, the method leads to low data utility when microdata have more sensitive attributes. In addition, the method does not generalize quasi-identifiers, which makes the anonymized data vulnerable to linking attacks. To address these problems, the paper proposes a SLOMS method. The method verti...
Publishing Microdata with a Robust Privacy Guarantee
Today, the publication of microdata poses a privacy threat. Vast research has striven to define the privacy condition that microdata should satisfy before it is released, and devise algorithms to anonymize the data so as to achieve this condition. Yet, no method proposed to date explicitly bounds the percentage of information an adversary gains after seeing the published data for each sensitive...
Multi-Privacy Collaborative Data publishing with Efficient Anonymization Techniques
Privacy preservation in collaborative data publishing provides methods and tools for publishing data while protecting the sensitive information in the data set. The success of data mining in privacy relies on information sharing and the quality of data in a distributed environment. Several anonymization techniques, such as bucketization and generalization, have been proposed, which do not prevent...